Abstract

Technology such as computer-assisted language learning (CALL) is used in teaching and learning in
foreign language classrooms, where it is most needed. One promising emerging technology that supports language
learning is automatic speech recognition (ASR). Integrating such technology, especially in the instruction of 
pronunciation in the classroom, is important in helping students to achieve correct pronunciation. In Iraq, English is a 
foreign language, and it is not surprising that learners commit many pronunciation mistakes. One factor contributing to 
these mistakes is the difference between the Arabic and English phonetic systems. Thus, the sound transformation from 
the mother tongue (Arabic) to the target language (English) is one barrier for Arab learners. The purpose of this study is 
to investigate the effectiveness of using the automatic speech recognition (ASR) software EyeSpeak in improving the
pronunciation of Iraqi learners of English. An experimental research project with a pretest-posttest design is conducted 
over a one-month period in the Department of English at Al-Turath University College in Baghdad, Iraq. The ten 
participants are randomly selected first-year college students enrolled in a pronunciation class that uses traditional 
teaching methods and ASR EyeSpeak software. The findings show that using EyeSpeak software leads to a significant 
improvement in the students’ English pronunciation, evident from the test scores they achieve after using EyeSpeak 
software. 
1. Introduction

Technology is increasingly used in everyday life for many purposes, one of which is education, where it serves as a
way to facilitate teaching and learning. It is necessary to use technology, especially in the foreign language classroom 
where there is limited time for exposure to and practice of the target language. Therefore, language learners need to 
listen to and practice using the target language more often in a stress-free environment. The important factor in learning 
any language is being able to speak that language in an intelligible way. In helping students achieve this, using 
classroom-based technology such as automatic speech recognition (ASR) to teach English pronunciation has a positive 
impact on an individual’s outcomes and performance (Chapelle, 2001). It provides authentic material, such as native
speakers’ pronunciation of the target language, and at the same time allows the students to listen to and practice their 
pronunciation in an enjoyable setting; it also gives each individual learner immediate corrections and feedback, which is 
difficult to achieve in class with a large number of students. 

The increased use of automatic speech recognition (ASR) is now an important element in teaching pronunciation. Many 
studies recommend its use as essential to the process due to the advantages it offers learners (Chapelle, 2001; Butler- 
Pascoe & Wiburg, 2003; Neri, Cucchiarini, & Strik, 2001; Harless, Zier, & Duncan, 1999; Kim, 2006; Pennington, 
1996; McCrocklin, 2014). For instance, ASR technology gives the instructor the opportunity to discover each individual 
learner’s problems with pronunciation. Furthermore, using automatic speech recognition (ASR) gives each student
the chance to practice pronunciation, identify mistakes and receive feedback from a native speaker. It provides a stress- 
free environment that encourages the students to speak the target language and motivates them to participate (Morley, 
1991). All of these advantages assist the students in the process of learning English pronunciation. Ultimately, it helps 
them improve their pronunciation and their overall oral skills. 

Furthermore, clear and accurate pronunciation will lead to better understanding and make communication easier, 
whereas poor pronunciation can mislead the listener and make comprehension difficult (Eskenazi, 1999). Learners’ 
poor pronunciation is a barrier to speaking the target language effectively; therefore, the learners focus much attention 
on attempts to master pronunciation (Fraser, 1999). There are several reasons for learners’ poor pronunciation, which 
include mother tongue interference and phonetic system differences (Flege, 1995). This study concentrates on the 
differences between the Arabic and English phonetic systems, especially those English sounds that do not exist in the
Arabic sound system. While the Arabic phonetic system includes 32 consonant sounds and 8 vowel sounds, the
English equivalent has 24 consonant sounds and 22 vowel sounds (Abdou et al., 2014, p. 371). In addition, such 
differences between the mother tongue and the target language are considered problems for learners, especially in 
learning pronunciation (Bell, 1995). Interference from the mother tongue (Arabic) in the foreign language (English) is
considered one of the factors that cause problems for Arab learners in general, including Iraqi learners, and
leads them to have difficulty in mastering and producing accurate English pronunciation. This issue hinders
communication in the target language and discourages them from practicing and speaking English. Therefore, this study 
investigates whether using pronunciation training software may help learners to improve their English pronunciation 
despite differences between the Arabic and English phonetic systems. The main issue relates to the English sounds that 
do not exist in the Arabic phonetic system. Therefore, teachers should make the students aware of those differences and 
help them overcome their pronunciation errors. A successful learner requires proper training in those differences. 
According to Kenworthy (1987, p. 4), six factors affect pronunciation accuracy. First are the phonetic system 
differences between the mother tongue and the target language. The second is the learner’s age, as younger learners 
learn faster than adults. Third is the amount of exposure to the target language. Fourth is the learner’s phonetic ability,
which allows the learner to discriminate between sounds. The fifth factor is the learner’s attitude and identity, and sixth is the
learner’s motivation and desire to produce good pronunciation. Three of these factors can be addressed using automatic 
speech recognition (ASR) software, which can provide the help students need in successful pronunciation learning: (a) 
it provides students with several training exercises and drills that make them aware of the sound differences between the 
mother tongue and the target language; (b) it offers learners exposure to the target language; and (c) it provides practice 
activities, correction and feedback that will enable the students to discriminate between the sound differences. Thus, 
incorporating computer-assisted language learning (CALL) software in the classroom can improve students’ 
pronunciation, as using automatic speech recognition (ASR) software to teach pronunciation will provide the students 
with authentic materials and activities by listening to native speakers, identifying students’ pronunciation problems and 
providing correction and feedback. Therefore, this will assist the students’ learning process, lead them to produce 
accurate English pronunciation and help them become independent learners. Students’ practising and completing drills 
by themselves also saves the teacher time (Kenworthy, 1987). 

1.1 The Purpose of the Study 

The purpose of the study is to investigate the effectiveness of computer-assisted ASR software in teaching English 
pronunciation to Iraqi students in the Department of English Language at Al-Turath University College, Baghdad, Iraq. 
Integrating automatic speech recognition (ASR) EyeSpeak software may help students improve their English 
pronunciation. Taking advantage of the software’s features, such as drills, correction and feedback, may help students 
reduce pronunciation errors related to the transfer of Arabic sounds to their English speech production. In addition, 
implementing automatic speech recognition (ASR) software in the teaching and learning environment will provide 
learners with examples of authentic pronunciation by native English speakers. In this study, EyeSpeak software is used 
for one month; improvements in students’ pronunciation are measured by administering pretests and posttests to 
evaluate their pronunciation proficiency levels before and after using the software. 

2. Literature Review 

Many studies in the field of language teaching and technology attempt to find the most effective aids for improving 
students’ pronunciation, and the aim of any teacher is to help the students achieve accurate, intelligible and native-like
pronunciation. The traditional way of teaching pronunciation is usually to concentrate on the comparative
evaluation of how the students’ speech production compares to native speaker pronunciation (Molholt, 1988). However, 
correcting all of the students’ mistakes in the traditional class is difficult to achieve, as it is time consuming. Therefore, 
many teachers try to implement technology aids to assist learning, as giving corrections and feedback to students is 
essential to ensure their pronunciation accuracy. When the learner’s pronunciation is accurate, it allows for a 
spontaneous conversation and makes the flow of communication easier (Pennington, 1996). 

2.1 Teaching Pronunciation and Theories in Language Learning 

Several studies demonstrate the effectiveness of pronunciation training on students’ pronunciation performance. Morley 
(1994) recommends paying more attention to pronunciation instruction as a new trend in teaching pronunciation, while 
Derwing, Munro and Wiebe (1998) state that instruction in segmental accuracy and general oral habits leads to
enhanced pronunciation. In addition, a focus on new instructional plans should involve consideration of not only 
language forms and functions, but also issues of learner involvement and learner training techniques. Student 
involvement in learning leads them to become active learners in the sense that they can develop and modify their speech 
production. However, teachers must be aware of the possibilities that technology offers learners, as this can increase 
their understanding of language learning. 

Having students practice pronunciation is important in helping them improve their speaking production and increase 
their self-confidence, and can make them less hesitant to speak the target language. Students’ self-esteem plays a 
significant role in improving their English pronunciation because learning pronunciation is not only a matter of 
exposure to native speakers, but also relates to practising the target language themselves (Kenworthy, 1987). 

Many studies address the variables that lead to successful pronunciation learning. Vitanova and Miller (2002) indicate 
that teachers may provide learning strategies that raise awareness of the differences in the phonetic systems between the 
mother tongue and the target language, which will help students master the target language pronunciation.
Oxford (1986b) similarly highlights the important role of the teacher in providing learning strategies to
enhance students’ pronunciation performance. 

2.2 The Importance of Using Pronunciation Training in the Classroom 

In teaching English as a Foreign Language (EFL), the most important aspect teachers should focus on is helping the
students to master pronunciation. Students’ poor pronunciation leads to communication failure and leaves learners
with low self-esteem and stress (Morley, 1998).

Using automatic speech recognition (ASR) can help the students to identify the differences between the sounds of their 
mother tongue and the target language (Richards, 2015). The students will be able to listen to the native speaker 
pronunciation model, compare it with their own pronunciation, practice and receive immediate private feedback, which 
will encourage them to engage in repetitive practices (Richards, 2015). One of the factors that cause Arab learners to 
face difficulty in learning English pronunciation is the influence of their mother tongue on the target language. The 
Arabic language interfering with the English language affects the Arab learners’ pronunciation of English. Similarly, 
various researchers (Celce-Murcia, Brinton, & Goodwin, 1996; Pennington, 1994) note that the mother tongue can 
greatly influence the target language, which affects students’ pronunciation in terms of intonation and the production of 
vowel or consonant sounds. 

Therefore, it is important that teachers of pronunciation integrate technology aids to help students improve. The teacher 
can provide individual students with the opportunity to practice their pronunciation by using automatic speech 
recognition (ASR) to help them overcome their mistakes as it offers individual practice, correction and feedback, which 
is difficult to accomplish in traditional classes because it is time consuming. Moreover, integrating automatic speech 
recognition (ASR) is particularly necessary in a foreign language classroom as it allows individual students to gain 
awareness of their pronunciation issues. With the teacher’s guidance and the support of ASR software, each student can 
overcome pronunciation issues related to the influence of their mother tongue on the target foreign language. It gives 
them the opportunity to listen to native speakers and practice the drills needed to help them improve their target 
language pronunciation. 

2.3 The Usefulness of Pronunciation Training Feedback 

In traditional classes, the teacher corrects the students’ pronunciation mistakes immediately, a technique based on the 
audio-lingual teaching method. However, in the early 1970s, according to the communicative approach, it was felt that 
teachers should not correct students’ errors immediately. Krashen (1985) explains that immediate correction may make 
the students feel uncomfortable, lose their self-confidence and refuse to participate in further activities. Carroll and 
Swain (1993) review various types of correction and feedback used to improve students’ English performance, and 
show that second language learners who receive feedback perform better than students who do not have their errors 
corrected. As there is no recognised effective measure concerning the best way to correct students’ pronunciation, 
feedback is considered the most successful, even though the students first need to understand the feedback in order to 
make the changes required to improve their pronunciation. Al-Qudah (2012) mentions that the focus in teaching 
pronunciation in the foreign language classroom should be on having students produce native-like pronunciation, while, 
in teaching pronunciation, the focus should be on sound production and knowing the place of sound articulation (p. 202).

In traditional pronunciation classes, it is usually difficult for the teacher to address all the students’ English 
pronunciation performances and problems. Thus, technology applications are widely used in language teaching because 
of the benefits for individual learners. In particular, using automatic speech recognition (ASR) in the pronunciation
classroom can provide correction and feedback for all individual students in a private, stress-free environment, as each 
learner works independently with the software. Therefore, many teachers in foreign language classrooms implement the 
use of technology in teaching English pronunciation (Neri, Strik, & Cucchiarini, 2006). Automatic speech recognition, 
which addresses individual learners and can identify their pronunciation errors, is commonly used as an aid in teaching 
pronunciation (Truong, Neri, de Wet, Cucchiarini, & Strik, 2005), and ASR software with internet features is considered
most effective in improving the teaching and learning of pronunciation. There are different types of ASR software, 
some CD-based and others internet-based; we use EyeSpeak in this study, which is an internet-based program. 
According to Witt (2012), EyeSpeak has a useful feature in terms of providing the students with feedback in an 
interesting way, as it provides details of tongue position as compared with the native-speaker model. It gives the 
students feedback in the form of an animated sound wave, phonetic transcription and animated sound production, and 
provides them with an overall score for their pronunciation performance. Moreover, it provides students with details 
about segmental features, including consonant and vowel phonemes, and suprasegmental features, including the 
pronunciation aspects of timing, pitch and loudness. In addition to these features, which increase teachers’ interest in 
using the software in pronunciation classes, the software introduces subject materials in a colourful and interesting 
environment that attracts students’ attention (Hişmanoğlu, 2010).

3. Methodology 

In recent decades, research has addressed the implementation of automatic speech recognition (ASR) EyeSpeak 
software in the foreign language classroom because it can help in achieving native-like pronunciation. The EyeSpeak 
software can detect and diagnose students’ errors and provide automatic feedback, which raises awareness of their 
pronunciation errors. Therefore, the aim of the study is to investigate the effect of using the software on the 
pronunciation performance of Iraqi EFL students, and to determine whether teaching pronunciation using ASR is more
efficient than using traditional methods. To test this, we conduct experimental research in this pilot study to answer the 
following research question: Is there any effect of using automatic speech recognition (ASR) EyeSpeak software on 
Iraqi students’ English pronunciation? 

In an experiment with one group, we utilise a pretest-posttest method to investigate students’ pronunciation 
performance by considering differences in test scores before and after the use of EyeSpeak in teaching English 
pronunciation. We conduct the pilot study at a private university in Baghdad, Iraq; the participants are ten first-year 
college students, randomly selected from the Department of English, aged between sixteen and twenty-one. 

4. Instrument 

The pronunciation teaching material is the textbook Better Pronunciation by J. D. O’Connor (2003), as assigned by 
Iraq’s Ministry of Higher Education. This book introduces the pronunciation of English to students at the intermediate 
and advanced levels. It explains how the speech organs work, and also considers separate sounds before blending them 
into words, rhythm patterns and intonation. 

The pilot study adopts EyeSpeak software as a multimedia pronunciation-teaching tool that includes speech recognition, 
online-based pronunciation features and sound-distinction training. EyeSpeak includes drills, practice, speech recording 
with playback capabilities, sound and graphic articulatory displays and animated and visual pronunciation feedback. 
These features enable evaluation and give EFL learners visual feedback on their production of
English sounds (vowels and consonants). This software can be a valuable pronunciation tool for beginner to
intermediate EFL learners, helping them distinguish between English
sounds and accurately produce English phonemes. Several features of EyeSpeak have the potential to assist EFL
learners in resolving their segmental pronunciation problems, and may help them improve their pronunciation. These 
features include showing animated speech organs (sound production), viewing the sound production waveform, 
listening to native speakers to compare with the students’ pronunciation and listening to minimal pairs containing target 
sounds with transcription. 

The software consists of two user profiles, one for the teacher and one for the students. The purpose of the teacher 
profile is to monitor students’ progress; it enables the teacher to observe the individual learner’s skill level and score. 

The student profile consists of five sections: home, lesson, speech, dictionary and fun. 

The focus of the pilot study is on learning pronunciation, and the aforementioned speech section assists students in 
learning and practising English pronunciation. To measure any improvement in English pronunciation due to using 
EyeSpeak software, we use an achievement test. We administer a pretest and posttest to measure whether there is a 
significant difference between students’ scores before the exposure (pretest) and after exposure (posttest) to EyeSpeak. 
The main goal is to measure students’ improvement in pronouncing the English consonant sounds not found in the 
Arabic phonetic system (/p/, /v/, /tʃ/, /ʒ/, /ŋ/). The test consists of a written part and an oral element. The test questions
are adapted from those in English Pronunciation in Use by Marks (2007), while the test also utilises content from Better 
Pronunciation by O’Connor (2003). On the written test, the students have to answer all 11 transcription and multiple- 
choice questions intended to measure specific sounds. In the 48-word oral test, each student must read four pairs of 
words that contain two consonant sounds. The content of the test is verified by two senior lecturers. The analysis 
technique used in this study is a paired-samples t-test to measure differences between the pretest and posttest scores.
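
As an illustration of the analysis technique named above, the following minimal sketch computes the paired-samples t statistic from matched pretest and posttest scores. The scores and the function name `paired_t_statistic` are hypothetical and are not the study's data or code. The sketch uses only the Python standard library and returns the statistic and its degrees of freedom, not a p-value.

```python
import math
import statistics


def paired_t_statistic(pre, post):
    """Return (t, df) for a paired-samples t-test on matched score lists."""
    if len(pre) != len(post):
        raise ValueError("pre and post must hold one score per student")
    n = len(pre)
    if n < 2:
        raise ValueError("need at least two students")
    # One difference per student; pre - post gives a negative t when scores rise.
    diffs = [a - b for a, b in zip(pre, post)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical scores for ten students (illustration only, not the study's data).
pretest = [30, 25, 41, 33, 28, 36, 39, 22, 35, 42]
posttest = [38, 36, 49, 45, 37, 50, 48, 35, 44, 55]

t_value, df = paired_t_statistic(pretest, posttest)
print(f"t({df}) = {t_value:.2f}")
```

With ten students the test has nine degrees of freedom, and the sign of t depends only on the order of subtraction.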

5. Data Collection Procedure 

This four-week pilot study began on January 10 and ended on February 10, 2016. There were three 45-minute classes 
per week for the pronunciation subject. Each week, the students attended one class in the sound lab using EyeSpeak 
software. The students took a 90-minute pretest on January 10 at 10 a.m. to measure their pronunciation proficiency 
level before using the EyeSpeak software, with the test directed by their pronunciation class teacher. In the next day’s 
lab-based pronunciation class, the lecturer explained, in detail, how to use the EyeSpeak software’s sections and 
categories. The lecturer verified that all students understood how to use the software before the actual class session 
started. In the following days, after the pilot study began, students attended lab-based pronunciation classes. Each 
student had a computer, an EyeSpeak account and his or her own headset. Students had to log in to their accounts each 
time they came to the lab; the lecturer then introduced them to the sounds they had to work on that day using particular 
EyeSpeak activities. At 10 a.m. on February 10, at the end of the study, the students took a 90-minute posttest. 

6. Data Analysis 

A paired-samples t-test determined whether the difference between the pretest and posttest was significantly different
from zero, and a Shapiro-Wilk test determined whether the differences could have been produced by a normal
distribution (Razali & Wah, 2011). The result of the Shapiro-Wilk test is not significant (W = 0.88, p = .123), which
suggests that the deviations from normality are explainable by random chance; thus, normality can be assumed.
Levene’s test was applied to assess whether the homogeneity of variance assumption was met (Levene, 1960), which requires
that the variance of the dependent variable be approximately equal at each testing occasion. The result of Levene’s test is not
significant (F(1, 20) = 1.93, p = .180), indicating that the assumption of homogeneity of variance is met.

The result of the paired-samples t-test is significant (t(9) = -11.22, p < .001), suggesting that the true difference in the
pretest and posttest means is significantly different from zero. The mean of the pretest (M = 33.10) is significantly lower
than that of the posttest (M = 44.21). Table 1 presents the results of the paired-samples t-test; Figure 1 presents the
means of the pretest and posttest.

Table 1. Paired-samples t-test for the difference between pretest and posttest

Pretest           Posttest          t-test    Probability value   Cohen’s d
M       SD        M       SD        t         p                   d
33.10   7.36      44.21   21.69     -11.22    <.001               0.69

Note. Degrees of freedom for the t-statistic = 9. d represents Cohen’s d. 
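
The effect size in Table 1 can be reproduced from the reported summary statistics if one assumes that Cohen's d was computed as the mean difference divided by the root mean square of the two standard deviations. That formula is an assumption, since the text does not state which variant of d was used. The variable names below are illustrative only.

```python
import math

# Summary statistics as reported in Table 1.
m_pre, sd_pre = 33.10, 7.36
m_post, sd_post = 44.21, 21.69

# Assumed formula: d = (M_post - M_pre) / sqrt((SD_pre^2 + SD_post^2) / 2).
pooled_sd = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
cohens_d = (m_post - m_pre) / pooled_sd

print(f"d = {cohens_d:.2f}")  # prints "d = 0.69"
```

Under this assumption the computed value rounds to the d of 0.69 shown in the table.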



Figure 1. The means of the pronunciation pretest (A) and posttest (B) 


7. The Findings 

The findings of this study indicate a significant improvement in students’ pronunciation after using EyeSpeak software 
for a one-month period. The study focuses on the English sounds absent from the phonetic system of their mother 
tongue (Arabic), and the students’ test scores reveal an improvement in these sounds. The findings show a great 
improvement in the sounds /p/, hi, /}/, a slight improvement in 1^1, /tj/, M 3 / and less improvement in It)/. A similar study 
by Mohsin (2012) reveals significant improvement in students’ pronunciation (individual sounds) after using CALL- 
based ASR software. Moreover, the conclusion of the pilot study confirms that using ASR EyeSpeak software did 
improve Iraqi students’ pronunciation of English sounds absent from their mother tongue (Arabic). Therefore, we 
recommend implementing CALL ASR in teaching pronunciation, given that it leads to student improvement. 
Furthermore, a study by Eskenazi, Tomokiyo and Wang (2000) reveals that using CALL pronunciation training 
software is valuable in improving the students’ pronunciation of difficult English sounds. In contrast, a study by
Witt (2012) reveals that using EyeSpeak software may not help the students to understand the feedback as it does not 
give a score for their phoneme level, only their word level. Other studies (Kim, 2006; Neri, Cucchiarini, & Strik, 2008) 
affirm that ASR is effective in teaching pronunciation, especially for non-native speaker students, as it significantly 
improves the students’ pronunciation. 

8. Conclusion 

The objective of this study is to help Iraqi college students improve their English pronunciation by focusing on the 
English consonant sounds not found in the Arabic phonetic system. Differences in the phonetic systems affect the
students’ pronunciation of the English language because they transfer their Arabic pronunciation to their English
pronunciation. Analysis of the study’s results reveals that there is a significant improvement in students’ English 
pronunciation in the posttest compared with their pretest scores. This difference indicates that the use of EyeSpeak 
software in the pronunciation class helps students in their learning process and leads them to produce more accurate 
English pronunciation. 

Thus, the use of EyeSpeak software in a pronunciation class can improve students’ English pronunciation and help them 
to learn more quickly and realise their errors. Therefore, we recommend the use of automatic speech recognition (ASR) 
EyeSpeak software as a tool to support teaching English, especially for EFL learners. Integrating EyeSpeak software 
can augment the students’ ability to learn and increase their level of understanding. 